Vocal tract normalization as linear transformation of MFCC

نویسندگان

  • Michael Pitz
  • Hermann Ney
چکیده

We have shown previously that vocal tract normalization (VTN) results in a linear transformation in the cepstral domain. In this paper we show that Mel-frequency warping can equally well be integrated into the framework of VTN as linear transformation on the cepstrum. We show examples of transformation matrices to obtain VTN warped Mel-frequency cepstral coefficients (VTN-MFCC) as linear transformation of the original MFCC and discuss the effect of Mel-frequency warping on the Jacobian determinant of the transformation matrix. Finally we show that there is a strong interdependence of VTN and Maximum Likelihood Linear Regression (MLLR) for the case of Gaussian emission probabilities.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A study on the influence of covariance adaptation on jacobian compensation in vocal tract length normalization

In this paper, we first show that accounting for Jacobian in Vocal-Tract Length Normalization (VTLN) will degrade the performance when there is a mismatch between the train and test speaker conditions. VTLN is implemented using our recently proposed approach of linear transformation of conventional MFCC, i.e. a feature transformation. In this case, Jacobian is simply the determinant of the line...

متن کامل

Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification

This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral fea...

متن کامل

Speaker normalization in the MFCC domain

It has been shown in several recent publications that application of vocal tract normalization (VTN) is a successful method for improving the accuracy of speaker independent recognisers. We argue that VTN can be implemented in the filterbank domain and propose a model to achieve this. We show how the model can be implemented directly in the MFCC domain, where it may be viewed as a constrained v...

متن کامل

Investigations on linear transformations for speaker adaptation and normalization

This thesis deals with linear transformations at various stages of the automatic speech recognition process. In current state-of-the-art speech recognition systems linear transformations are widely used to care for a potential mismatch of the training and testing data and thus enhance the recognition performance. A large number of approaches has been proposed in literature, though the connectio...

متن کامل

Analysis of gender normalization using MLP and VTLN features

This paper analyzes the capability of multilayer perceptron frontends to perform speaker normalization. We find the context decision tree to be a very useful tool to assess the speaker normalization power of different frontends. We introduce a gender question into the training of the phonetic context decision tree. After the context clustering the gender specific models are counted. We compare ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003